    Improving Classifier Confidence using Lossy Label-Invariant Transformations

    Providing reliable model uncertainty estimates is imperative to enabling robust decision making by autonomous agents and humans alike. While there have recently been significant advances in confidence calibration for trained models, examples with poor calibration persist in most calibrated models. Consequently, multiple techniques have been proposed that leverage label-invariant transformations of the input (i.e., an input manifold) to improve worst-case confidence calibration. However, manifold-based confidence calibration techniques generally do not scale and/or require expensive retraining when applied to models with large input spaces (e.g., ImageNet). In this paper, we present the recursive lossy label-invariant calibration (ReCal) technique, which leverages label-invariant transformations of the input that induce a loss of discriminatory information to recursively group (and calibrate) inputs, without requiring model retraining. We show that ReCal outperforms other calibration methods on multiple datasets, especially on large-scale datasets such as ImageNet.
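
    The following is a minimal sketch of the underlying idea, not the full recursive ReCal procedure: apply a lossy, label-invariant transformation, group held-out inputs by whether the top-1 prediction survives the transformation, and fit a separate temperature for each group. Here `model`, `downscale`, and the held-out logits/labels are assumptions for illustration.

```python
# Sketch: one round of lossy label-invariant grouping plus per-group temperature scaling.
# `model(x)` is assumed to return softmax probabilities; `downscale(x)` is a hypothetical
# lossy, label-invariant transformation (e.g., image downscaling).
import numpy as np

def group_by_lossy_transform(model, downscale, inputs):
    """Split inputs by whether the top-1 prediction survives the lossy transformation."""
    preserved, changed = [], []
    for x in inputs:
        same_label = np.argmax(model(x)) == np.argmax(model(downscale(x)))
        (preserved if same_label else changed).append(x)
    return preserved, changed

def fit_temperature(logits, labels, temps=np.linspace(0.5, 5.0, 46)):
    """Pick the temperature that minimizes negative log-likelihood on held-out data."""
    def nll(t):
        z = logits / t
        z = z - z.max(axis=1, keepdims=True)          # numerical stability
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(labels)), labels].mean()
    return min(temps, key=nll)  # calibrate each group with its own temperature
```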

    Resilient Linear Classification: An Approach to Deal with Attacks on Training Data

    Data-driven techniques are used in cyber-physical systems (CPS) for controlling autonomous vehicles, handling demand responses for energy management, and modeling human physiology for medical devices. These data-driven techniques extract models from training data, and their performance is often analyzed with respect to random errors in the training data. However, if the training data is maliciously altered by attackers, the effect of such attacks on the learning algorithms underpinning data-driven CPS has yet to be considered. In this paper, we analyze the resilience of classification algorithms to training data attacks. Specifically, we propose a generic metric tailored to measure the resilience of classification algorithms with respect to worst-case tampering of the training data. Using the metric, we show that traditional linear classification algorithms are resilient only under restricted conditions. To overcome these limitations, we propose a linear classification algorithm with a majority constraint and prove that it is strictly more resilient than the traditional algorithms. Evaluations on both synthetic data and a real-world retrospective arrhythmia medical case study show that the traditional algorithms are vulnerable to tampered training data, whereas the proposed algorithm is more resilient (as measured by worst-case tampering).
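
    As an illustration of the worst-case tampering notion (not the paper's resilience metric or its majority-constrained algorithm), the brute-force probe below retrains a standard linear classifier for every way of flipping a small budget of binary training labels and reports the worst resulting test error. The helper name and the use of scikit-learn's LogisticRegression are assumptions.

```python
# Brute-force probe of sensitivity to worst-case training-label tampering (illustrative only;
# exponential in the budget, and not the resilience metric proposed in the paper).
import itertools
import numpy as np
from sklearn.linear_model import LogisticRegression

def worst_case_flip_error(X_train, y_train, X_test, y_test, budget=2):
    """Worst test error achievable by flipping at most `budget` binary (0/1) training labels."""
    worst = 0.0
    for idx in itertools.combinations(range(len(y_train)), budget):
        y_tampered = y_train.copy()
        y_tampered[list(idx)] = 1 - y_tampered[list(idx)]   # flip the selected labels
        clf = LogisticRegression(max_iter=1000).fit(X_train, y_tampered)
        err = np.mean(clf.predict(X_test) != y_test)
        worst = max(worst, err)
    return worst
```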

    Digital Health: Software as a Medical Device

    Software, such as mobile device apps or telemedicine, creates exciting new opportunities for patient engagement and for improving healthcare. There are three main types of mobile apps: native, web, and hybrid. The wireless technologies in smartphones and wearable sensors, such as smartwatches, offer the potential for additional biometric data collection. HIPAA compliance requires multiple levels of oversight and auditing. Software development costs for healthcare are considerably higher than for consumer-oriented products due to FDA regulatory requirements; testing proofs of concept through low-cost alternatives is therefore an important development strategy.

    Context-Aware Detection in Medical Cyber-Physical Systems

    This paper considers the problem of incorporating context in medical cyber-physical systems (MCPS) applications for the purpose of improving the performance of MCPS detectors. In particular, in many applications additional data could be used to conclude that actual measurements might be noisy or wrong (e.g., machine settings might indicate that the machine is improperly attached to the patient); we call such data context. The first contribution of this work is a formal definition of context, namely additional information whose presence is associated with a change in the measurement model (e.g., higher variance). Given this formulation, we develop the context-aware parameter-invariant (CA-PAIN) detector, which improves upon the original PAIN detector by recognizing events with noisy measurements and not raising unnecessary false alarms. We evaluate the CA-PAIN detector both in simulation and on real-patient data; in both cases, it achieves roughly a 20-percent reduction in false alarm rates over the PAIN detector, indicating that formalizing context and using it in a rigorous way is a promising direction for future work.
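
    The formal notion of context (additional information whose presence changes the measurement model) can be illustrated with a simple residual-based alarm that inflates its assumed measurement noise when a context flag is active. This is a hedged sketch of the idea only, not the PAIN or CA-PAIN detector; the function name and noise parameters are assumptions.

```python
# Sketch: a residual-based alarm whose measurement-noise model switches when a context
# flag (e.g., "device possibly detached") is active. Not the PAIN/CA-PAIN detector itself;
# all parameter values are illustrative assumptions.
import numpy as np

def context_aware_alarms(measurements, predictions, context_flags,
                         sigma_nominal=1.0, sigma_context=3.0, z_threshold=3.0):
    """Raise an alarm when the standardized residual exceeds z_threshold."""
    alarms = []
    for y, y_hat, in_context in zip(measurements, predictions, context_flags):
        sigma = sigma_context if in_context else sigma_nominal  # context => noisier measurement model
        alarms.append(abs(y - y_hat) / sigma > z_threshold)
    return np.array(alarms)
```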

    Towards Assurance Cases for Resilient Control Systems

    The paper studies the problem of constructing assurance cases for embedded control systems developed using a model-based approach. Assurance cases aim to provide a convincing argument that the system delivers certain guarantees, based on evidence obtained during the design and evaluation of the system. We suggest an argument strategy centered around properties of the models used in development and properties of the tools that manipulate these models. The paper presents a case study of a resilient speed estimator for an autonomous ground vehicle and takes the reader through a detailed assurance case arguing that the estimator computes speed estimates with bounded error.

    Guaranteed Conformance of Neurosymbolic Models to Natural Constraints

    Deep neural networks have emerged as the workhorse for a large section of robotics and control applications, especially as models for dynamical systems. Such data-driven models are in turn used for designing and verifying autonomous systems. This is particularly useful in modeling medical systems, where data can be leveraged to individualize treatment. In safety-critical applications, it is important that the data-driven model conform to established knowledge from the natural sciences. Such knowledge is often available, or can be distilled into a (possibly black-box) model M; for instance, the unicycle model for an F1 racing car. In this light, we consider the following problem: given a model M and a state transition dataset, we wish to best approximate the system model while remaining a bounded distance away from M. We propose a method to guarantee this conformance. Our first step is to distill the dataset into a few representative samples called memories, using the idea of a growing neural gas. Next, using these memories, we partition the state space into disjoint subsets and compute bounds that should be respected by the neural network when the input is drawn from a particular subset. This serves as a symbolic wrapper for guaranteed conformance. We argue theoretically that this only leads to a bounded increase in approximation error, which can be controlled by increasing the number of memories. We experimentally show that on three case studies (Car Model, Drones, and Artificial Pancreas), our constrained neurosymbolic models conform to the specified models M (each encoding various constraints) with order-of-magnitude improvements compared to augmented Lagrangian and vanilla training methods.
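
    A minimal sketch of the symbolic-wrapper step is shown below: the state space is partitioned by nearest memory, and the network's prediction is clamped to per-partition bounds derived from the reference model M. How the memories (from a growing neural gas) and the bounds are computed is omitted; `network`, `memories`, and the bound arrays are assumptions for illustration.

```python
# Sketch of the symbolic wrapper: nearest-memory partitioning plus clamping of the
# network output to per-cell bounds derived from the reference model M (bounds assumed given).
import numpy as np

def conformant_predict(x, network, memories, lower_bounds, upper_bounds):
    """Clamp network(x) to the bound interval of the partition cell containing x.

    memories:                  (m, d) array of representative states
    lower_bounds/upper_bounds: (m, k) arrays of per-cell output bounds
    """
    cell = int(np.argmin(np.linalg.norm(memories - x, axis=1)))  # nearest-memory partition
    y = network(x)
    return np.clip(y, lower_bounds[cell], upper_bounds[cell])
```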

    Curating Naturally Adversarial Datasets for Trustworthy AI in Healthcare

    Deep learning models have shown promising predictive accuracy for time-series healthcare applications. However, ensuring the robustness of these models is vital for building trustworthy AI systems. Existing research predominantly focuses on robustness to synthetic adversarial examples, crafted by adding imperceptible perturbations to clean input data. However, these synthetic adversarial examples do not accurately reflect the most challenging real-world scenarios, especially in the context of healthcare data. Consequently, robustness to synthetic adversarial examples may not necessarily translate to robustness against naturally occurring adversarial examples, which is highly desirable for trustworthy AI. We propose a method to curate datasets composed of natural adversarial examples for evaluating model robustness. The method relies on probabilistic labels obtained from automated weakly supervised labeling that combines noisy and cheap-to-obtain labeling heuristics. Based on these labels, our method adversarially orders the input data and uses this ordering to construct a sequence of increasingly adversarial datasets. Our evaluation on six medical case studies and three non-medical case studies demonstrates the efficacy and statistical validity of our approach to generating naturally adversarial datasets.
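
    A hedged sketch of the ordering-and-curation step appears below: examples are ranked by the confidence of their probabilistic (weakly supervised) labels, and nested subsets are built so that later subsets retain only the lowest-confidence, i.e., most naturally adversarial, examples. The array shapes, the confidence measure, and the number of levels are assumptions; the paper's labeling model and statistical validation are not reproduced here.

```python
# Sketch: rank examples by probabilistic-label confidence and build a nested sequence of
# increasingly adversarial subsets. `prob_labels` is assumed to be an (n, k) array of label
# distributions from a weak-supervision label model; the number of levels is illustrative.
import numpy as np

def increasingly_adversarial_subsets(prob_labels, n_levels=5):
    """Return index sets D_1, ..., D_n, where later sets keep only low-confidence examples."""
    confidence = prob_labels.max(axis=1)      # confidence of the most likely label
    order = np.argsort(confidence)            # least confident (most adversarial) first
    subsets = []
    for level in range(n_levels, 0, -1):      # shrink from the full dataset to the hardest slice
        keep = int(len(prob_labels) * level / n_levels)
        subsets.append(order[:keep])          # each set is a prefix of the ordering => nested
    return subsets
```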

    Calibrated Prediction with Covariate Shift via Unsupervised Domain Adaptation

    Reliable uncertainty estimates are an important tool for helping autonomous agents or human decision makers understand and leverage predictive models. However, existing approaches to estimating uncertainty largely ignore the possibility of covariate shift, i.e., that the real-world data distribution may differ from the training distribution. As a consequence, existing algorithms can overestimate certainty, possibly yielding a false sense of confidence in the predictive model. We propose an algorithm for calibrating predictions that accounts for the possibility of covariate shift, given labeled examples from the training distribution and unlabeled examples from the real-world distribution. Our algorithm uses importance weighting to correct for the shift from the training to the real-world distribution. However, importance weighting relies on the training and real-world distributions being sufficiently close. Building on ideas from domain adaptation, we additionally learn a feature map that tries to equalize these two distributions. In an empirical evaluation, we show that our proposed approach outperforms existing approaches to calibrated prediction when there is covariate shift.
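
    The sketch below illustrates the importance-weighting half of the approach: a source-vs-target domain classifier is fit on training features and unlabeled real-world features, its odds give importance weights (up to a constant depending on the sample sizes), and those weights are used when fitting a simple calibrator (weighted Platt scaling here). The learned feature map from the domain-adaptation step is omitted, and the function names and the use of scikit-learn are assumptions.

```python
# Sketch: importance-weighted calibration under covariate shift. A domain classifier estimates
# w(x) ~ p_target(x) / p_train(x); the weights are then used to fit a 1-D logistic (Platt-style)
# calibrator on held-out training scores. The paper's learned feature map is not included.
import numpy as np
from sklearn.linear_model import LogisticRegression

def importance_weights(X_train, X_target):
    """Estimate density-ratio weights (up to a constant) with a source-vs-target classifier."""
    X = np.vstack([X_train, X_target])
    domain = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_target))])
    clf = LogisticRegression(max_iter=1000).fit(X, domain)
    p_target = clf.predict_proba(X_train)[:, 1]            # P(domain = target | x)
    return p_target / np.clip(1.0 - p_target, 1e-6, None)  # odds ~ p_target(x) / p_train(x)

def weighted_platt_calibrator(scores_train, y_train, weights):
    """Fit a one-dimensional logistic calibrator using importance weights."""
    calib = LogisticRegression(max_iter=1000)
    calib.fit(np.asarray(scores_train).reshape(-1, 1), y_train, sample_weight=weights)
    return lambda s: calib.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]
```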